Dialect recognition using a phone-GMM-supervector-based SVM kernel
Abstract
In this paper, we introduce a new approach to dialect recognition which relies on the hypothesis that certain phones are realized differently across dialects. Given a speaker's utterance, we first obtain the most likely phone sequence using a phone recognizer. We then extract GMM supervectors for each phone instance. Using these vectors, we design a kernel function that computes the similarities of phones between pairs of utterances. We employ this kernel to train SVM classifiers that estimate the posterior probabilities used during recognition. Testing our approach on 30-second cuts from four Arabic dialects, we compare our performance to five approaches: PRLM; GMM-UBM; our own improved version of GMM-UBM, which employs fMLLR adaptation; our recent discriminative phonotactic approach; and a state-of-the-art system, a discriminatively trained SDC-based GMM-UBM. Our kernel-based technique outperforms all of these previous approaches; the overall EER of our system is 4.9%.
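The idea of comparing utterances phone by phone can be illustrated with a minimal sketch. This is not the kernel from the paper, whose exact form the abstract does not give. It assumes that each phone instance has already been mapped to a fixed-length supervector, that instances of the same phone are averaged and length-normalized, and that the kernel is a sum of inner products over phones shared by both utterances. The function names `phone_supervectors`, `phone_kernel`, and `gram_matrix` are hypothetical.

```python
import numpy as np


def phone_supervectors(phone_instances):
    """Collapse one utterance into a dict: phone label -> unit-length mean supervector.

    phone_instances: list of (phone_label, supervector) pairs, one per phone
    instance found by a phone recognizer (assumed to be computed elsewhere).
    """
    grouped = {}
    for phone, vec in phone_instances:
        grouped.setdefault(phone, []).append(np.asarray(vec, dtype=float))

    result = {}
    for phone, vecs in grouped.items():
        mean = np.mean(vecs, axis=0)
        norm = np.linalg.norm(mean)
        # Assumption: length-normalize so frequent phones do not dominate.
        result[phone] = mean / norm if norm > 0 else mean
    return result


def phone_kernel(utt_a, utt_b):
    """Sum of inner products over the phones that occur in both utterances."""
    shared = utt_a.keys() & utt_b.keys()
    return float(sum(np.dot(utt_a[p], utt_b[p]) for p in shared))


def gram_matrix(utterances):
    """Symmetric matrix of pairwise kernel values for a list of utterances."""
    n = len(utterances)
    gram = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            gram[i, j] = gram[j, i] = phone_kernel(utterances[i], utterances[j])
    return gram
```

Treating a missing phone as a zero vector makes this a linear kernel on the concatenated per-phone vectors, so the resulting matrix is positive semidefinite and could be passed to an SVM that accepts a precomputed kernel.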
Related papers
Linear and non linear kernel GMM supervector machines for speaker verification
This paper presents a comparison between Support Vector Machines (SVM) speaker verification systems based on linear and non linear kernels defined in GMM supervector space. We describe how these kernel functions are related and we show how the nuisance attribute projection (NAP) technique can be used with both of these kernels to deal with the session variability problem. We demonstrate the imp...
Addressing the Data-Imbalance Problem in Kernel-Based Speaker Verification via Utterance Partitioning and Speaker Comparison
GMM-SVM has become a promising approach to text-independent speaker verification. However, a problematic issue with this approach is the severe imbalance between the numbers of speaker-class and impostor-class utterances available for training the speaker-dependent SVMs. This data-imbalance problem can be addressed by (1) creating more speaker-class supervectors for SVM training through...
Dialect and Accent Recognition Using Phonetic-Segmentation Supervectors
We describe a new approach to automatic dialect and accent recognition which exceeds state-of-the-art performance in three recognition tasks. This approach improves the accuracy and substantially lowers the time complexity of our earlier phonetic-based kernel approach for dialect recognition. In contrast to state-of-the-art acoustic-based systems, our approach employs phone labels and segmentatio...
Text-Independent Speaker Verification via State Alignment
To model the speech utterance at a finer granularity, this paper presents a novel state-alignment-based supervector modeling method for text-independent speaker verification, which takes advantage of the state-alignment method used in hidden Markov model (HMM) based acoustic modeling in speech recognition. In this way, the proposed modeling method can convert a text-independent speaker verification...
A new kernel for SVM MLLR based speaker recognition
Speaker recognition using support vector machines (SVMs) with features derived from generative models has been shown to perform well. Typically, a universal background model (UBM) is adapted to each utterance yielding a set of features that are used in an SVM. We consider the case where the UBM is a Gaussian mixture model (GMM), and maximum likelihood linear regression (MLLR) adaptation is used...